Selective Sampling with Redundant Views

Authors

  • Ion Muslea
  • Steven Minton
  • Craig A. Knoblock
Abstract

Selective sampling, a form of active learning, reduces the cost of labeling training data by asking only for the labels of the most informative unlabeled examples. We introduce a novel approach to selective sampling which we call co-testing. Co-testing can be applied to problems with redundant views (i.e., problems with multiple disjoint sets of attributes that can be used for learning). We analyze the most general algorithm in the co-testing family, naive co-testing, which can be used with virtually any type of learner. Naive co-testing simply selects at random an example on which the existing views disagree. We applied our algorithm to a variety of domains, including three real-world problems: wrapper induction, Web page classification, and discourse tree parsing. The empirical results show that besides reducing the number of labeled examples, naive co-testing may also boost the classification accuracy.
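
The selection rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`naive_co_testing`, `train_nearest_centroid`), the two-view example layout, and the toy nearest-centroid learner are all assumptions chosen for illustration. The abstract only states that one learner is used per view and that a query is drawn at random from the examples on which the views disagree.

```python
import random


def train_nearest_centroid(xs, ys):
    """Toy stand-in learner (an assumption, not from the paper).

    Returns a classifier that predicts the label of the nearest class centroid.
    """
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def naive_co_testing(labeled, unlabeled, oracle, train, n_queries, rng):
    """Sketch of naive co-testing over two views.

    Each example is a pair (view1_features, view2_features).
    `labeled` is a list of (example, label); `oracle` returns the true label.
    """
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    for _ in range(n_queries):
        labels = [y for _, y in labeled]
        # Train one hypothesis per view on the current labeled set.
        h1 = train([ex[0] for ex, _ in labeled], labels)
        h2 = train([ex[1] for ex, _ in labeled], labels)
        # Contention points: unlabeled examples on which the views disagree.
        contention = [ex for ex in unlabeled if h1(ex[0]) != h2(ex[1])]
        if not contention:
            break  # the views agree everywhere, nothing left to query
        query = rng.choice(contention)  # "naive": pick one at random
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return labeled
```

Any learner with the same `train(xs, ys) -> predict` shape could be swapped in, which mirrors the abstract's claim that the naive variant works with virtually any type of learner.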

Similar resources

Selective Sampling with Co-Testing: Preliminary Results

We present a novel approach to selective sampling, co-testing, which can be applied to problems with redundant views (i.e., problems with multiple disjoint sets of attributes that can be used for learning). The main idea behind co-testing consists of selecting the queries among the unlabeled examples on which the existing

Selective Sampling With Naive Co-Testing: Preliminary Results

Selective sampling, a form of active learning, reduces the cost of labeling training data by asking only for the labels of the most informative unlabeled examples. We introduce a novel approach to selective sampling which we call co-testing. Co-testing can be applied to problems with redundant views (i.e., problems with multiple disjoint sets of attributes that can be used for learning). We ana...

Light Field Coding based on Flexible View Ordering for Unfocused Plenoptic Camera Images

Plenoptic cameras are devices designed for sampling the light field entering their main lens. The raw information captured by the sensor can then be processed for rendering views at different focal distances or from different perspectives. A plenoptic camera generates images with high resolution and highly redundant information compared with a conventional camera having resolution equal to that ...

Modern literary interpretation in understanding the meaning of the verse ‘There is nothing like Him’

Numerous views have been expressed by commentators and writers about the literary aspect and the meaning of the Qur'anic phrase "There is nothing like Him". The sequence of the words "ka" and "like" in the holy verse has led to two illusions, one literary and one semantic. The literary illusion is that "ka" seems to be redundant and the semantic illusion is that the word ‘like’ indirectly proves the...

Multiple Non-Redundant Spectral Clustering Views

Many clustering algorithms only find one clustering solution. However, data can often be grouped and interpreted in many different ways. This is particularly true in the high-dimensional setting where different subspaces reveal different possible groupings of the data. Instead of committing to one clustering solution, here we introduce a novel method that can provide several non-redundant clust...


Publication date: 2000